

Generalized Eigenvalue Problems with Generative Priors

Neural Information Processing Systems

Generalized eigenvalue problems (GEPs) find applications in various fields of science and engineering. For example, principal component analysis, Fisher's discriminant analysis, and canonical correlation analysis are specific instances of GEPs and are widely used in statistical data processing. In this work, we study GEPs under generative priors, assuming that the underlying leading generalized eigenvector lies within the range of a Lipschitz continuous generative model. Under appropriate conditions, we show that any optimal solution to the corresponding optimization problems attains the optimal statistical rate. Moreover, from a computational perspective, we propose an iterative algorithm called the Projected Rayleigh Flow Method (PRFM) to approximate the optimal solution. We theoretically demonstrate that under suitable assumptions, PRFM converges linearly to an estimated vector that achieves the optimal statistical rate. Numerical results are provided to demonstrate the effectiveness of the proposed method.
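To make the iteration template concrete, below is a minimal NumPy sketch of a projected Rayleigh-flow-style update for the GEP max_x x^T A x / x^T B x: a gradient ascent step on the Rayleigh quotient followed by a projection onto the range of the generative model. The projection oracle `project_to_range`, the step size, and the iteration count are illustrative assumptions; the paper's actual PRFM update and its analysis may differ.

```python
import numpy as np

def prfm(A, B, project_to_range, x0, step=0.1, iters=100):
    """Sketch of a projected Rayleigh-flow-style iteration for the GEP
    max_x x^T A x / x^T B x under a generative prior.

    `project_to_range` is a user-supplied (hypothetical) oracle mapping a
    vector onto the range of the generative model.
    """
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        rayleigh = (x @ A @ x) / (x @ B @ x)                      # current Rayleigh quotient
        grad = 2.0 * (A @ x - rayleigh * (B @ x)) / (x @ B @ x)   # its gradient
        x = x + step * grad                                       # ascent ("Rayleigh flow") step
        x = project_to_range(x)                                   # project onto the generative prior
        x = x / np.linalg.norm(x)                                 # keep the iterate normalised
    return x

# Usage with a trivial prior (identity projection), purely for illustration:
# d = 5; A = np.eye(d); B = np.eye(d)
# v = prfm(A, B, project_to_range=lambda z: z, x0=np.random.randn(d))
```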



Reviews: Optimal Statistical Rates for Decentralised Non-Parametric Regression with Linear Speed-Up

Neural Information Processing Systems

Update: I have read the author response and appreciate that they addressed some of my comments. The focus is on obtaining statistical guarantees on generalization. This is a highly relevant direction within the growing body of work on decentralized training. The paper is generally well written, contains very original ideas, and I was very excited to read it. The main reason I didn't give a higher rating was the limitations listed at the beginning of Sec 5. I commend the authors for acknowledging them.


Reviews: Optimal Statistical Rates for Decentralised Non-Parametric Regression with Linear Speed-Up

Neural Information Processing Systems

This paper provides a nice and clean characterization of a decentralized learning problem. The result is perhaps unsurprising in its form, but the analysis is far from trivial. There are some nontrivial assumptions required for the results to hold, which perhaps limit the scope of the contribution but do suggest interesting avenues for future research in this increasingly important area. Overall, this is a solid contribution and should be of interest to NeurIPS attendees who work in optimization and distributed systems.


Optimal Statistical Rates for Decentralised Non-Parametric Regression with Linear Speed-Up

Neural Information Processing Systems

We analyse the learning performance of Distributed Gradient Descent in the context of multi-agent decentralised non-parametric regression with the square loss function when i.i.d. samples are assigned to agents. We show that if agents hold sufficiently many samples with respect to the network size, then Distributed Gradient Descent achieves optimal statistical rates with a number of iterations that scales, up to a threshold, with the inverse of the spectral gap of the gossip matrix divided by the number of samples owned by each agent raised to a problem-dependent power. The presence of the threshold comes from statistics: it encodes the existence of a "big data" regime where the number of required iterations does not depend on the network topology. In this regime, Distributed Gradient Descent achieves optimal statistical rates with the same order of iterations as gradient descent run with all the samples in the network.
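For readers unfamiliar with the algorithm being analysed, the following is a minimal sketch of the generic gossip-plus-local-gradient template behind Distributed Gradient Descent, written here for plain least squares: each agent takes a gradient step on its local square loss and mixes its iterate with its neighbours through a gossip matrix. The parametric least-squares setting, the step size, and the iteration count are illustrative assumptions; the paper studies the non-parametric setting with specific step-size and stopping-time choices.

```python
import numpy as np

def distributed_gradient_descent(X_agents, y_agents, W, step=0.01, iters=200):
    """Sketch of Distributed Gradient Descent over a gossip matrix.

    X_agents[v], y_agents[v] are the samples held by agent v; W is a doubly
    stochastic gossip matrix supported on the communication graph.
    """
    n_agents = len(X_agents)
    d = X_agents[0].shape[1]
    theta = np.zeros((n_agents, d))                 # one iterate per agent
    for _ in range(iters):
        grads = np.zeros_like(theta)
        for v in range(n_agents):
            X, y = X_agents[v], y_agents[v]
            residual = X @ theta[v] - y
            grads[v] = X.T @ residual / len(y)      # local square-loss gradient
        theta = W @ theta - step * grads            # gossip average, then gradient step
    return theta
```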


Collaborative Learning with Shared Linear Representations: Statistical Rates and Optimal Algorithms

Niu, Xiaochun, Su, Lili, Xu, Jiaming, Yang, Pengkun

arXiv.org Machine Learning

Collaborative learning enables multiple clients to learn shared feature representations across local data distributions, with the goal of improving model performance and reducing overall sample complexity. While empirical evidence shows the success of collaborative learning, a theoretical understanding of the optimal statistical rate remains lacking, even in linear settings. In this paper, we identify the optimal statistical rate when clients share a common low-dimensional linear representation. Specifically, we design a spectral estimator with local averaging that approximates the optimal solution to the least squares problem. We establish a minimax lower bound to demonstrate that our estimator achieves the optimal error rate. Notably, the optimal rate reveals two distinct phases. In typical cases, our rate matches the standard rate based on the parameter counting of the linear representation. However, a statistical penalty arises in collaborative learning when there are too many clients or when local datasets are relatively small. Furthermore, our results, unlike existing ones, show that, at a system level, collaboration always reduces overall sample complexity compared to independent client learning. In addition, at an individual level, we provide a more precise characterization of when collaboration benefits a client in transfer learning and private fine-tuning.
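As a rough illustration of a spectral estimator with local averaging, the sketch below has each client form a local moment matrix from its own data, averages those matrices at the server, and returns the top-k eigenvectors of the average as the estimated shared subspace. The particular moment matrix used here (a sum of y_i^2 x_i x_i^T terms) is one common construction and is an assumption; the paper's estimator and its averaging scheme may be constructed differently.

```python
import numpy as np

def spectral_subspace_estimate(X_clients, y_clients, k):
    """Sketch of a spectral estimator with local averaging for a shared
    k-dimensional linear representation across clients."""
    d = X_clients[0].shape[1]
    M = np.zeros((d, d))
    for X, y in zip(X_clients, y_clients):
        M += (X.T * (y ** 2)) @ X / len(y)          # local moment matrix
    M /= len(X_clients)                              # average across clients
    _, eigvecs = np.linalg.eigh(M)                   # eigenvalues in ascending order
    return eigvecs[:, -k:]                           # top-k eigenvectors span the estimate
```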


How Projected Gradient Descent works in Machine Learning pipelines part1

#artificialintelligence

Abstract: This paper addresses a distributed convex optimization problem with a class of coupled constraints, which arise in a multi-agent system composed of multiple communities modeled by cliques. First, we propose a fully distributed gradient-based algorithm with a novel operator inspired by the convex projection, called the clique-based projection. Next, we scrutinize the convergence properties for both diminishing and fixed step sizes. For diminishing step sizes, we show convergence to an optimal solution under the assumptions of smoothness of the objective function and compactness of the constraint set. Additionally, when the objective function is strongly monotone, strict convergence to the unique solution is proved without the compactness assumption.
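The post's subject, projected gradient descent, can be summarised by the generic sketch below: take a gradient step, then project back onto the feasible set. The paper replaces the Euclidean projection with its clique-based projection tailored to coupled constraints; that operator is not reproduced here, and the example projection (onto the non-negative orthant) is purely illustrative.

```python
import numpy as np

def projected_gradient_descent(grad, project, x0, step=0.05, iters=500):
    """Generic projected gradient descent: gradient step, then projection
    back onto the feasible convex set."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = project(x - step * grad(x))
    return x

# Example: minimise ||x - c||^2 over the non-negative orthant.
c = np.array([1.0, -2.0, 0.5])
x_star = projected_gradient_descent(
    grad=lambda x: 2.0 * (x - c),
    project=lambda x: np.maximum(x, 0.0),
    x0=np.zeros(3),
)
```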

